Unsupervised Grammar Inference Systems for Natural Language

Authors

  • Andrew Roberts
  • Eric Atwell
Abstract

In recent years there have been significant advances in the field of Unsupervised Grammar Inference (UGI) for natural languages such as English or Dutch. This paper presents a broad range of UGI implementations, where we can begin to see how the theory has been put into practice. Several mature systems are emerging, built using complex models and capable of deriving natural language grammatical phenomena. The systems are classified into: models based on Categorial Grammar (GraSp, CLL, EMILE); Memory Based Learning models (FAMBL, RISE); evolutionary computing models (ILM, LAgts); and string-pattern searches (ABL, GB). An objectively measurable statistical comparison of the performance of the systems reviewed is not yet feasible. However, their merits and shortfalls are discussed, along with the future prospects for UGI.

Similar resources

Unsupervised Grammar Inference Using the Minimum Description Length Principle

Context Free Grammars (CFGs) are widely used in programming language descriptions, natural language processing, compilers, and other areas of software engineering where there is a need for describing the syntactic structures of programs. Grammar inference (GI) is the induction of CFGs from sample programs and is a challenging problem. We describe an unsupervised GI approach which uses simplicit...

Unsupervised Learning for Natural Language Processing

Given the abundance of text data, unsupervised approaches are very appealing for natural language processing. We present three latent variable systems which achieve state-of-the-art results in domains previously dominated by fully supervised systems. For syntactic parsing, we describe a grammar induction technique which begins with coarse syntactic structures and iteratively refines them in an ...

Unsupervised NLP and Human Language Acquisition: Making Connections to Make Progress

Natural language processing and cognitive science are two fields in which unsupervised language learning is an important area of research. Yet there is often little crosstalk between the two fields. In this talk, I will argue that considering the problem of unsupervised language learning from a cognitive perspective can lead to useful insights for the NLP researcher, while also showing how tool...

Grammatical Inference with Grammar-based Classifier System

This paper addresses the task of training a Grammar-based Classifier System (GCS) to learn grammar from data. GCS is a new model of Learning Classifier Systems in which the population of classifiers takes the form of a context-free grammar rule set in Chomsky Normal Form. GCS has been proposed to address both natural language grammar induction and learning formal grammar for DNA ...

Grammar-based Classifier System: A Universal Tool for Grammatical Inference

Grammatical Inference deals with the problem of learning structural models, such as grammars, from different sorts of data patterns, such as artificial languages, natural languages, biosequences, speech and so on. This article describes a new grammatical inference tool, the Grammar-based Classifier System (GCS), dedicated to learning grammar from data. GCS is a new model of Learning Classifier Systems i...

Publication date: 2002